Skip to content

Conversation

@jorgee
Copy link
Contributor

@jorgee jorgee commented Dec 1, 2025

Alternative for #5089

This pull request refactors revision and commit management for pipeline assets, streamlining how revisions are tracked and handled across CLI commands and internal logic. The main changes include removing the legacy revision map mechanism, updating CLI flags for clarity and consistency, and improving concurrency safety when cloning repositories. These updates simplify the codebase and improve reliability when managing pipeline versions.

Revision and Commit Management Refactor

  • Removed the legacy revision map system (REVISION_MAP, DEFAULT_REVISION_DIRNAME) from AssetManager, including related methods (getRevisionMap, revisionToCommitWithMap, updateRevisionMap, etc.). Revision tracking now relies directly on branch/tag and commit information from the repository. [1] [2] [3] [4] [5]
  • Updated methods for listing revisions and commits to use direct repository data, improving accuracy and reducing code complexity. [1] [2] [3]

CLI Improvements

  • Changed the -a, -all-revisions flag to -a, -all in CmdDrop for clarity, and updated help descriptions accordingly.
  • Removed the -d, -deep flag from CmdPull and CmdRun, and marked it as deprecated in documentation for future removal. [1] [2] [3] [4]

Concurrency and Reliability

  • Added a file mutex mechanism in AssetManager.createSharedClone to prevent concurrent clones of the same commit, ensuring safe and reliable asset downloads. [1] [2]
  • Improved error handling during clone operations to clean up incomplete clones if an error occurs.

Codebase Cleanup

  • Removed unused imports and code related to the old revision map system from CLI command files (CmdDrop.groovy, CmdList.groovy, CmdPull.groovy). [1] [2] [3]
  • Updated CLI output formatting for revision and commit listing to reflect new logic and improve readability.

These changes collectively modernize revision tracking and asset management, making the system more robust and maintainable.

marcodelapierre and others added 30 commits January 15, 2024 17:29
Signed-off-by: Dr Marco Claudio De La Pierre <[email protected]>
Signed-off-by: Dr Marco Claudio De La Pierre <[email protected]>
Signed-off-by: Dr Marco Claudio De La Pierre <[email protected]>
Signed-off-by: Dr Marco Claudio De La Pierre <[email protected]>
Signed-off-by: Dr Marco Claudio De La Pierre <[email protected]>
Signed-off-by: Dr Marco Claudio De La Pierre <[email protected]>
Signed-off-by: Dr Marco Claudio De La Pierre <[email protected]>
Signed-off-by: Dr Marco Claudio De La Pierre <[email protected]>
Signed-off-by: Dr Marco Claudio De La Pierre <[email protected]>
Signed-off-by: Dr Marco Claudio De La Pierre <[email protected]>
Signed-off-by: Dr Marco Claudio De La Pierre <[email protected]>
… operation

Signed-off-by: Dr Marco Claudio De La Pierre <[email protected]>
…f "master"

Signed-off-by: Dr Marco Claudio De La Pierre <[email protected]>
Signed-off-by: Ben Sherman <[email protected]>
Signed-off-by: Dr Marco De La Pierre <[email protected]>
Signed-off-by: Dr Marco De La Pierre <[email protected]>
Signed-off-by: Dr Marco De La Pierre <[email protected]>
Signed-off-by: Dr Marco Claudio De La Pierre <[email protected]>
Signed-off-by: Dr Marco Claudio De La Pierre <[email protected]>
Signed-off-by: Dr Marco Claudio De La Pierre <[email protected]>
marcodelapierre and others added 15 commits June 25, 2024 14:33
Signed-off-by: Dr Marco Claudio De La Pierre <[email protected]>
Signed-off-by: Dr Marco Claudio De La Pierre <[email protected]>
…ate method ; plus related updates across codebase
…re repo for default branch get branches and tags

Signed-off-by: jorgee <[email protected]>
Signed-off-by: jorgee <[email protected]>
@jorgee
Copy link
Contributor Author

jorgee commented Dec 1, 2025

Find bellow the directory tree for a project with the bare repo and two revisions (commit 451ebdand and master branch).

  • They are under .nexflow subdirectory, so they can co-exist with old-version clones
  • bare_repo contains the bare local copy (.git) of the remote repo.
  • 'commits' contains the shared clones per pulled revision. Each commit folder stores the worktree and a minimum .git folder, to store current branch metadata and possible changes. Objects are not replicated, they are linked to bare repo objects via objects/info/alternates.
tree -a
.
└── .nextflow
    ├── bare_repo
    │   ├── branches
    │   ├── config
    │   ├── FETCH_HEAD
    │   ├── HEAD
    │   ├── hooks
    │   ├── logs
    │   │   └── refs
    │   │       └── heads
    │   ├── objects
    │   │   ├── info
    │   │   └── pack
    │   │       ├── pack-84754fe8dbf547352f163823af8f47a8c7b7df7f.idx
    │   │       └── pack-84754fe8dbf547352f163823af8f47a8c7b7df7f.pack
    │   ├── packed-refs
    │   └── refs
    │       ├── heads
    │       └── tags
    └── commits
        ├── 451ebd9dcb56045d80963945305811aa65f413d0
        │   ├── .git
        │   │   ├── branches
        │   │   ├── config
        │   │   ├── FETCH_HEAD
        │   │   ├── HEAD
        │   │   ├── hooks
        │   │   ├── index
        │   │   ├── logs
        │   │   │   ├── HEAD
        │   │   │   └── refs
        │   │   │       └── heads
        │   │   ├── objects 
        │   │   │   ├── info
        │   │   │   │   └── alternates
        │   │   │   └── pack
        │   │   ├── packed-refs
        │   │   └── refs
        │   │       ├── heads
        │   │       └── tags
        │   ├── LICENSE
        │   ├── main.nf
        │   └── README.md
        └── 7588c46ffefb4e3c06d4ab32c745c4d5e56cdad8
            ├── .git
            │   ├── branches
            │   ├── config
            │   ├── FETCH_HEAD
            │   ├── HEAD
            │   ├── hooks
            │   ├── index
            │   ├── logs
            │   │   └── refs
            │   │       └── heads
            │   │           └── master
            │   ├── objects
            │   │   ├── info
            │   │   │   └── alternates
            │   │   └── pack
            │   ├── packed-refs
            │   └── refs
            │       ├── heads
            │       └── tags
            ├── .gitignore
            ├── LICENSE
            ├── main.nf
            ├── nextflow.config
            └── README.md

Copy link
Member

@pditommaso pditommaso left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can briefly summarise what's the benefits of this approach vs revisions map ?

Comment on lines 60 to 64
List<AssetManager> dropList = []
if ( allRevisions ) {
def revManager = new AssetManager(args[0])
revManager.listRevisions().each { rev ->
if( rev == DEFAULT_REVISION_DIRNAME )
rev = null
if( !revManager.localRootPath.exists() ) {
throw new AbortOperationException("No match found for: ${revManager.getProjectWithRevision()}")
}
revManager.listCommits().each { rev ->
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's try reducing the cyclomatic complexity of this method, splitting in smaller self-describing private methods.

Comment on lines 61 to 92
if( moreDetailed )
detailed = true
if( detailed && allRevisions ) {
all.each {
println(" $it")
def revManager = new AssetManager(it)
revManager.listRevisionsAndCommits().each { k, v ->
if( !moreDetailed )
v = v.substring(0, 10)
println(" $v $k")
}
}
} else if( allRevisions ) {
all.each {
println(" $it")
def revManager = new AssetManager(it)
revManager.listRevisions().each {
println(" $it")
}
}
} else if( allCommits ) {
all.each {
println(" $it")
def revManager = new AssetManager(it)
revManager.listCommits().each { println(" $it") }
}
} else {
all.each { println(" $it") }
}
} else if( allCommits ) {
all.each{
println(" $it")
def revManager = new AssetManager(it)
revManager.listCommits().each{ println(" $it") }
}
} else {
all.each{ println(" $it") }
}
}

}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above, try to make this logic more compact following "Reveal intention" pattern. https://claude.ai/share/32835bb7-8395-431e-a73f-53797de9aaf6

@jorgee
Copy link
Contributor Author

jorgee commented Dec 2, 2025

The main benefit is removing the revisions map. It reduces the code to maintain as we removed all the functions to create, update, prune the map for every operation. It also avoids information duplication. What was stored in the map is currently in the bare repo and with per-revision fetch, there is no difference in the behavior. The only case were it is less efficient is in the detailed list where it iterates the repo branches and tags instead of iterating through the map entries. However, this detailed list doesn't exist in the master and it seems a replication of 'info'. I think we could remove it.

@pditommaso
Copy link
Member

Agree on keep a single source of truth and removing duplicate commands

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

5 participants